Analysis of the hikikomori phenomenon – an international infodemiology study of Twitter data in Portuguese

Background: Hikikomori refers to the extreme isolation of individuals in their own homes, lasting at least six months. In recent years social isolation has become an important clinical, social, and public health problem, with increased awareness of hikikomori around the globe. Portuguese is one of the six most spoken languages in the world, but no studies have analysed the content regarding this phenomenon expressed in Portuguese.
Objective: To explore the hikikomori phenomenon on Twitter in Portuguese, utilising a mixed-methods approach encompassing content analysis, emotional analysis, and correlation analysis.
Methods: A mixed-methods analysis of all publicly available tweets in the Portuguese language containing a specific keyword (hikikomori) and posted between 1st January 2008 and 19th October 2022. The content analysis involved categorising tweets based on tone, content, and user types, while correlation analysis was used to investigate user engagement and geographical distribution. Statistical analysis and artificial intelligence were employed to classify and interpret the tweet data.
Results: Of the 13,915 tweets retrieved, 10,731 were classified as “negative” in tone and 3,184 as “positive”. Regarding content, “curiosities” was the most posted topic, as well as the most retweeted and liked. Worldwide, most of the hikikomori-related tweets in Portuguese were posted in Europe, and “individuals with hikikomori” were the most active type of user. Regarding emotion analysis, the majority of tweets were “neutral”.
Conclusions: These findings show the global reach of the discourse on the hikikomori phenomenon among Portuguese speakers. They also indicate an increase in the number of tweets on this topic in certain continents over the years. These findings can contribute to developing specific interventions, support networks, and awareness-raising campaigns for affected individuals.


Background
Hikikomori is a Japanese concept that emerged in the last decades of the twentieth century and refers to the extreme withdrawal of some people into their own homes, lasting at least six months [1]. This isolation usually begins in adolescence or early adulthood and affects a higher proportion of males [2]. Hikikomori is characterised by a lack of participation in education, work, and other daily life activities, resulting in profound suffering or functional impairment [3][4][5]. Hikikomori affects not only the individuals who suffer from this condition, but also their families, who desperately try to help them without success [6,7].
Although the internet is increasingly present in our lives and people are therefore more digitally connected, this may sometimes lead to isolation and increased loneliness [8]. People may isolate themselves on the internet for different reasons, from online gaming to social networking, as well as family problems and young people's difficulties with inclusion at school [9,10]. The COVID-19 pandemic has also led to an increased awareness of social isolation. Studies in different countries, such as Brazil, Portugal, England, China, Iran, Malaysia, Pakistan, Philippines, Thailand, and Vietnam, showed that the pandemic affected mental health, leading to anxiety, depression, and stress, which resulted in social isolation [11][12][13][14]. The pandemic brought many social changes, from mandatory physical distancing to quarantines, which may have reinforced hikikomori-prone behaviour in people with mental health conditions [15,16].
Infodemiology is the science that includes the collection, analysis and interpretation of health-related information in electronic media (internet and other digital sources) with the aim of informing public health and public policy [17,18]. The internet is essential for sharing opinions, knowledge, and problems, and online social networks allow new forms of (online) communication and information sharing [19]. Twitter stands out as a prominent social network, valued for its short messages with a limit of 280 characters (currently up to 25,000 for subscribers of the premium version), with the possibility of anonymity [19][20][21][22]. Users can also repost content from other users (retweet), which allows that content to reach a larger audience, and sometimes mark their tweets with hashtags to identify a theme and allow other users to see related tweets [23,24]. Recent research has embraced the analysis of tweets to explore, identify, and reach individuals with particular characteristics of interest and to seek their perspectives, which would otherwise be marginalised or excluded. Twitter is a helpful tool to investigate the hikikomori phenomenon since those affected often use social media as an online 'refuge' [1,25]. Previous infodemiology studies explored content and perceptions related to hikikomori on Twitter in Japanese and some Western languages, finding that personal stories were the most posted content and that hikikomori was mentioned several times in non-Japanese Western languages [7,8]. Understanding the content of tweets related to the hikikomori phenomenon provides valuable information on how this social issue is perceived and discussed among different people.
Although hikikomori was considered a typical phenomenon related to Japanese culture [2,26,27], over the years different studies have been conducted to investigate this phenomenon in other countries [1,7,28,29].While Portuguese is the sixth most spoken language in the world, to this day, no studies have analysed this phenomenon in this language.
The main aim of this study was to explore content related to the hikikomori phenomenon in the Portuguese language on Twitter. In particular, this study addressed the following research questions: (1) How do Twitter users express their views on hikikomori and what emotions do they associate with it? (2) Has the frequency of tweets about the hikikomori phenomenon changed over time, including during the COVID-19 pandemic? (3) What type of content related to hikikomori generates more interest on Twitter? (4) Are there any geographical differences in Portuguese-speaking countries regarding the hikikomori phenomenon tweets?

Research strategy
The research strategy focused on the collection and content analysis of Portuguese-language tweets about hikikomori. We included tweets that met the following criteria: (1) public (not private/protected) tweets; (2) tweets containing the keyword hikikomori; (3) tweets posted between 1st January 2008 and 19th October 2022; and (4) tweets with text in the Portuguese language. The exclusion criteria were: (1) tweets in which the majority of the text was in a language other than Portuguese; and (2) tweets with only a link or image and no text. The tool used for collecting tweets was Tweet Binder, which has been widely used in previous research and provides access to 100% of public tweets [30,31]. This tool provides the tweet text, the count of retweets and likes for each tweet, the date of publication, a link to the tweet in its context, the user description, and geolocation data (obtained from the biography [bio] of the account that published the tweet). Regarding geolocation, we gathered data at the country level. To enable comparison of the number of tweets posted in each country, we grouped them into five continents: Americas (North, Central and South America), Europe, Africa, Asia and Oceania. The number of retweets and likes generated by each tweet was analysed as an indicator of user interest in a given topic.
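The inclusion and exclusion criteria above can be read as a simple filter over the collected records. The following Python sketch is illustrative only and is not the authors' pipeline: the record fields (`is_public`, `language`, `country`), the helper names, and the small country-to-continent lookup are all assumptions introduced here, and the actual Tweet Binder export format is not described in the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Study window stated in the text.
START, END = date(2008, 1, 1), date(2022, 10, 19)

@dataclass
class Tweet:
    # Hypothetical record layout, for illustration only.
    text: str
    posted: date
    is_public: bool
    language: str            # assumed code for the majority language, e.g. "pt"
    country: Optional[str]   # assumed to be derived from the account bio

# Illustrative subset; a real lookup would cover every country.
CONTINENT_BY_COUNTRY = {
    "Brazil": "Americas", "Portugal": "Europe", "Angola": "Africa",
    "Japan": "Asia", "Australia": "Oceania",
}

def strip_links(text: str) -> str:
    """Drop URL tokens so that link-only tweets end up with empty text."""
    words = [w for w in text.split() if not w.startswith(("http://", "https://"))]
    return " ".join(words)

def is_eligible(tweet: Tweet) -> bool:
    """Apply the four inclusion criteria and the two exclusion criteria."""
    body = strip_links(tweet.text)
    return (
        tweet.is_public
        and bool(body)                         # not link-only / image-only
        and "hikikomori" in body.lower()       # keyword present in the text
        and START <= tweet.posted <= END       # inside the study window
        and tweet.language == "pt"             # majority of text in Portuguese
    )

def continent_of(tweet: Tweet) -> str:
    """Group a country into one of the five continents, or 'Missing'."""
    return CONTINENT_BY_COUNTRY.get(tweet.country, "Missing")
```

Tweets without a usable country fall into a "Missing" group here, mirroring the label used for tweets that did not report a location.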

Content analysis process
All tweets in our database were analysed using a content analysis procedure that consisted of creating codes and categories. We created a codebook to characterise each tweet according to the following categories: (i) tone, that is, whether the content was written from a positive perspective (tweets that express solidarity, self-disclosure, encouragement, gratitude, enthusiasm and pride) or a negative perspective (tweets that express blame, stigma, and negative opinions); and (ii) content, that is, whether the tweet gave information relating to the hikikomori phenomenon, was an account of personal stories, or was about curiosities of the phenomenon. Table 1 provides a detailed characterisation of the categories and examples of tweets that fall into these categories. The Twitter users were classified into three user types: "individuals with hikikomori" (people who describe themselves as hikikomori), "family and friends" (including family members and close friends or acquaintances), and "others" (users who did not fit the previous types).

Application of artificial intelligence to evaluate tweets
Recent technological advancements have led to the development of artificial intelligence (AI), including machine learning (ML) [32] and deep learning (DL) [33]. Neural networks, inspired by human brain neurons, are extensively used in various applications, such as natural language processing (NLP) [34], weather prediction [35], coronavirus detection [36], and image object detection [37]. In this study, a pre-trained neural network called BERTweet [38], trained on 850 million English tweets, was employed to classify hikikomori-related tweets into different categories.
Before applying the BERTweet network, the tweet database underwent preprocessing steps. Non-English tweets were translated into English using Google Translator since the network is trained only on English tweets. Previous studies have demonstrated that employing Google Translator to translate text into English and subsequently using a model trained on English tweets can enhance the performance of machine learning models [39][40][41]. The tweets were then normalised by removing special characters, separating negative tenses, and eliminating repeated characters. As BERTweet was not originally trained to classify the desired categories, a process called fine-tuning was conducted. The manually classified tweets were randomly split into two subsets, with 75% used for fine-tuning the network and 25% for validation. We assessed the performance of the model on the validation set by computing the weighted F1 score, which was consistently above 0.8 across all categories. This methodology, which has previously produced positive outcomes [42], was used to ensure that the fine-tuned BERTweet model performed well on the database. The fine-tuned BERTweet model was then used to categorise the remaining tweets that had not been manually classified.
Additionally, the emotions expressed in the tweets were analysed using a pre-trained neural network called emotion-english-distilroberta-base [43]. This network, capable of detecting Ekman's six basic emotions plus neutral, was applied to the translated and normalised dataset of 13,915 tweets [44]. The emotion-english-distilroberta-base model, previously used in other research studies, does not require additional fine-tuning as it serves the same purpose for which it was originally trained [45,46]. The number of likes and retweets each tweet generated, the date and time of each tweet, a permanent link to the tweet, and a description of each user's profile were collected. The nature of users who posted tweets was determined according to the available information (tweet content, description of the user's profile, or Twitter identifier).
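As a rough illustration of the preprocessing and validation steps described above, the sketch below normalises text, performs a random 75%/25% split, and computes a weighted F1 score using only the Python standard library. It does not reproduce the authors' code: the exact normalisation rules, the random seed, and the function names are assumptions, and the fine-tuning of the network itself is not shown.

```python
import random
import re
from collections import Counter

def normalise(text: str) -> str:
    """Assumed normalisation: collapse character runs, drop special characters."""
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # runs of 3+ identical chars -> 2
    text = re.sub(r"[^\w\s'#@]", " ", text)      # replace special characters
    return re.sub(r"\s+", " ", text).strip()     # tidy whitespace

def split_75_25(items, seed=0):
    """Random split into 75% (fine-tuning) and 25% (validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)        # seed is an arbitrary assumption
    cut = int(len(shuffled) * 0.75)
    return shuffled[:cut], shuffled[cut:]

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)
```

Weighting each class's F1 by its support means that a frequent class (such as negative tone in this dataset) contributes more to the overall score than a rare one, which is a common choice when the categories are imbalanced.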

Statistical analysis
Descriptive statistics were used to summarise tweets, likes, and retweets of users related to the content topics, tone, user type, emotions, and location. Correlation coefficients were determined to measure the strength and direction of their association. We also investigated the number of tweets and retweets generated by Twitter users by day of the week, grouped by user type. Additionally, we conducted an emotional analysis to graphically represent Twitter users' emotions by type of user. Time trends were used to describe the number of all tweets posted by continent between 2008 and 2022, as well as to illustrate the number of tweets posted before and after the COVID-19 pandemic. Lastly, the proportions of tweets posted by type of user across continents were graphically represented. All analyses were performed with STATA version 15 (StataCorp LP).
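The text does not state which correlation coefficient was used. As one possible reading, the sketch below computes a Pearson correlation between two engagement counts, for example likes and retweets per tweet. The choice of Pearson, the function name, and the use of Python rather than STATA are all assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

The sign of the coefficient gives the direction of the association and its absolute value gives the strength. Because like and retweet counts are typically skewed, a rank-based coefficient such as Spearman's could be a reasonable alternative.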

Ethical considerations
This study used publicly available tweets and received approval from the Ethics Committee of the Biomedical Sciences Institute Abel Salazar at the University of Porto (Ref 2023/CE/P03/(P401/CETI/ICBAS)).
Regarding the world regions, the majority of tweets (n = 12,400; 89.1%) did not report their location ("Missing"). Among the tweets with available location, the largest proportion was posted in "Europe" (n = 553, 4.0%).
There was a total of 21,307 likes and 5,359 retweets of hikikomori-related tweets in Portuguese. Regarding content, "curiosities" showed the highest proportion of likes and retweets. Tweets posted in a positive way had more likes and retweets than tweets posted in a negative way. The "others" type of user accounted for a higher proportion of likes and retweets, and "disgust" was the emotion with the most likes and retweets. "Europe" was the continent with the highest number of likes and retweets of hikikomori-related tweets in the Portuguese language.
The full details of the frequency of these tweets are reported in Table 2.
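The percentages reported above follow directly from the counts and the total of 13,915 tweets. A small sketch of that arithmetic is shown here, with a hypothetical helper name and the assumption that percentages are rounded to one decimal place.

```python
TOTAL_TWEETS = 13_915  # total number of tweets analysed, as reported in the text

def share(count: int, total: int = TOTAL_TWEETS) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)

missing_pct = share(12_400)  # tweets without a reported location
europe_pct = share(553)      # tweets located in Europe

# Overall ratio of likes to retweets across all tweets.
likes_per_retweet = 21_307 / 5_359
```

With these counts, the "Missing" and "Europe" shares come out at 89.1% and 4.0%, matching the values reported, and the ratio shows roughly four likes for every retweet.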

Emotion analysis
Figure 1 displays the Twitter users' emotions by type of user and shows that, in all user types, the majority of tweets were "neutral", whereas "disgust" was the least frequent emotion. Among "individuals with hikikomori", the second most frequent emotion was "sadness", whilst among "family and friends" it was "surprise", and among "others" it was "joy".

Number of tweets over time (2008-2022)
Over the years (Fig. 2), there was a progressive increase in the number of tweets about hikikomori in Portuguese on all continents except Oceania, where tweets were only retrieved from 2018 onwards and were fewer in number.
Particularly in the Americas, Europe, and Asia, the increase in the number of tweets became steeper as the years passed.
Regarding the COVID-19 pandemic period, a difference in the number of tweets posted before and after the onset of the pandemic can be observed graphically: the number of tweets was higher during and after the pandemic (Fig. 3).

Continents
In Africa, Asia and Oceania, "individuals with hikikomori" were the type of user posting the most tweets. In the Americas, tweets were equally distributed between "individuals with hikikomori" and "others". In Europe, the majority of tweets were posted by "family and friends" (Fig. 4).

Key findings
The number of likes and retweets retrieved reflects an interest in the hikikomori phenomenon among Portuguese-speaking Twitter users. Regarding the content, tweets about curiosities were the most frequent, with the majority reporting hikikomori as something negative. Concerning the user type, "individuals with hikikomori" posted more tweets about the story of their day (personal stories), while for "family and friends" of people with hikikomori, curiosities was the topic that received the most attention, and "others" posted more tweets about the concept of hikikomori itself. Among all the types of users, "individuals with hikikomori" were the most frequent. In terms of emotions, most tweets were neutral, with the least frequent emotion being disgust.
Knowledge of the hikikomori phenomenon among Portuguese speakers is spread globally, with users in Europe posting the most tweets and users in Oceania the fewest. The distribution of the different user types was relatively

Comparison with the other literature
A study conducted with psychiatrists from several countries supported the notion that the hikikomori phenomenon is global [27,47]. Consistent with our findings, a previous study [7] identified Twitter content suggesting the existence of hikikomori in Western countries based on the languages analysed (Catalan, English, French, Italian, and Spanish). The number of tweets posted from different continents has increased significantly over the years. This may be explained, on the one hand, by people using Twitter more to express their thoughts and feelings and, on the other hand, by the hikikomori phenomenon becoming a better known and trending topic worldwide [1,25,47]. A study exploring the general population's attitudes toward prison volunteering demonstrated that users posted more tweets in Portuguese about volunteering in prisons in America (Ferrão Nunes et al.: Public discourse towards volunteering in prisons: An infodemiology study of Twitter data, submitted), whereas our study revealed that users from Europe posted the most. This may be related to the fact that people from Europe are more familiar with the term hikikomori than people from other continents.
While a previous study [8] of tweets about hikikomori in Japanese showed that personal stories were the most posted content, in our study curiosities about hikikomori were predominant. This suggests that the hikikomori phenomenon has generated more interest among people, given that the users who posted the tweets were aware of curiosities about it.
The interest of users in a specific topic can be assessed by analysing the number of retweets and likes that each tweet generates [48]. A study [8] that explored this phenomenon by analysing tweets found more likes than retweets, similar to our study. This could be attributed to people being more comfortable showing their interest in the subject, but less inclined to actively share the tweets for wider dissemination.
Two previous studies [7,8] have additionally shown that the majority of tweets reported hikikomori as a problem, which is corroborated by the findings of our study and suggests widespread awareness of the seriousness of the phenomenon.

Implication of the findings for future policies and research
Twitter is a helpful tool to study the hikikomori phenomenon since the affected individuals often use social networks as a refuge [1,25].Internet platforms allow us to reach socially withdrawn youth, and in this way, warn of risks and provide support.
This study suggests, similarly to other studies, that the hikikomori phenomenon is not only restricted to Japan [7,8,27,47]; it specifically shows that the phenomenon is also known and spoken about among Portuguese speakers all over the world.
This study provides valuable insights into the hikikomori phenomenon on Twitter in Portuguese, offering implications for future research and policies aimed at addressing this complex social issue. These findings are a call to action to further investigate and develop methods to assist individuals with hikikomori who have been identified through the Twitter platform. The identification of individuals with hikikomori tendencies in this study underscores the need to provide them with the support and resources necessary to seek help. Future research could focus on effective intervention and support strategies tailored to the needs of individuals affected by hikikomori. Another area for future investigation is the potential of social media platforms like Twitter to provide psychosocial support and establish a good support network for individuals affected by hikikomori, as well as their families and friends, while also encouraging professional help. Understanding how these platforms can be utilised to promote mental health and well-being in the context of hikikomori is crucial for developing effective support systems. It is also essential to promote education on the subject, alerting the public to its risks and consequences, and raising awareness of ways to prevent its development.
Future studies could consider conducting comparative analyses by examining data from Twitter and other social media platforms in other languages and countries where the hikikomori phenomenon has not yet been researched.This approach would provide a more comprehensive understanding of the global impact of hikikomori and its manifestations in different cultural contexts.

Strengths and limitations
This is the first study to investigate tweets related to hikikomori in the Portuguese language and to investigate its geographical differences. However, this study has some limitations. First, using a specific keyword may have limited our collection of tweets, since users may have used another term to refer to the hikikomori phenomenon, and thus tweets that refer to it in a different way may not have been collected. Furthermore, the potential deletion of tweets over time, whether by the social network, users, or through account deletion or protection, may have resulted in the exclusion of valuable data from the analysis, potentially impacting the comprehensiveness of the study findings. Additionally, the increase in tweets observed after the pandemic may not be related to users' interest in the topic. Therefore, it may not be conclusive that the pandemic has significantly impacted the level of interest in the hikikomori phenomenon compared to other topics. Another limitation of the study is that a significant portion of the tweets lacked geographical location data, limiting the ability to gain insights into regional variations of the hikikomori phenomenon among Portuguese speakers. Finally, whilst we analysed the number of retweets and likes generated by each tweet as an indicator of user interest in a given topic, these engagement metrics can be influenced by other factors and thus may not truly match user interest.

Conclusions
These findings demonstrate that the hikikomori phenomenon is actively discussed and appears to be prevalent among Portuguese speakers on all continents of the world. This supports and enhances our understanding of the globalisation of this phenomenon. The majority of tweets seemed to originate from individuals apparently affected by hikikomori. They reported curiosities about the phenomenon and discussed it in a negative light, suggesting that most people perceive it as a problem that needs to be addressed. Tweets about hikikomori in the Americas, Asia, and Europe have been steadily increasing in recent years, without a noticeable surge during the COVID-19 pandemic. This suggests that the prevalence of hikikomori is gradually growing in these continents. This study underscores the need to provide individuals with hikikomori tendencies with the necessary support and resources to improve their quality of life. It also offers implications for future research and policies addressing this complex social issue on Twitter.

Hikikomori
Term used to describe the social phenomenon of individuals, often young adults, who withdraw from social life and isolate themselves at home for a prolonged period, avoiding social interaction.

Infodemiology
Refers to the study of how information spreads and impacts public health, especially on the internet. Analyses patterns of information disclosure to understand how it affects people's knowledge and behaviour.

Machine Learning
Subset of artificial intelligence that involves the development of algorithms and statistical models that enable computers to perform tasks without explicit programming. It relies on patterns and inference from data to improve performance over time.

Deep Learning
A specialised area of machine learning that involves using neural networks with multiple layers (deep neural networks) to model and solve complex problems. It is particularly effective in tasks such as image and speech recognition.

BERTweet
Natural language processing model based on BERT (Bidirectional Encoder Representations from Transformers) architecture, specifically tailored for processing and understanding Twitter data. It enhances language understanding in a social media context.

Emotion-English-Distilroberta-Base
Pre-trained language model based on the Distilroberta architecture, specialised in recognising and understanding emotions in English text. It is trained to analyse and categorise emotions expressed in written content.

Fig. 1 Proportion of type of user by considered emotions of tweets

Fig. 2 Temporal evolution of tweets published from 2008 to 2022 by continents

Fig. 3 Time trend of all tweets posted 2 years before and 2 years after the beginning of the COVID-19 pandemic (1st March 2020). In this figure, the line represents the frequency of tweets posted by users

Fig. 4 Proportion of tweets published by each user type (Y-axis) according to the continent (X-axis)

Table 2
Number of tweets, retweets, and likes related to the content topics, tone, user type, emotions, and location

Table 3
Content topics related to tweets classified according to the type of user, emotions, continents, and tone